Abstract. The number of frequencies of factors of length $n + 1$ in a recurrent aperiodic infinite
word does not exceed $3\Delta\mathcal{C}(n)$, where $\Delta\mathcal{C}(n)$ is the first difference of factor complexity, as shown
by Boshernitzan. Pelantova together with the author derived a better upper bound for infinite
words whose language is closed under reversal. In this paper, we further diminish the upper
bound for uniformly recurrent infinite words whose language is invariant under all elements of
a finite group of symmetries and we prove the optimality of the obtained upper bound.
1. Introduction



When studying factor frequencies, the Rauzy graph is a powerful tool. Using this tool, the
following results have been obtained. Dekking in [8] has described factor frequencies of two
famous infinite words: the Fibonacci word and the Thue-Morse word. Using Rauzy graphs,
it is readily seen that frequencies of factors of a given length of any Arnoux-Rauzy word over
an $m$-letter alphabet attain at most $m + 1$ distinct values. Explicit values of factor frequencies
have been derived by Berthe in [4] for Sturmian words and by Wozny and Zamboni in [16] for
Arnoux-Rauzy words in general.

Queffelec in [14] has explored factor frequencies of fixed points of morphisms from another
point of view: as a shift invariant probability measure. She has provided a rather complicated
algorithm for the computation of values of such a measure. For some special classes of fixed
points of morphisms (circular marked uniform morphisms), Frid [10] has described completely
their factor frequencies.

A simple idea concerning Rauzy graphs led Boshernitzan [5] to an upper bound on the
number of different factor frequencies in an arbitrary recurrent aperiodic infinite word. He
has shown that the number of frequencies of factors of length $n + 1$ does not exceed $3\Delta\mathcal{C}(n)$,
where $\Delta\mathcal{C}(n)$ is the first difference of factor complexity. In [6], it has been shown that $\Delta\mathcal{C}(n)$
is bounded for infinite words with sublinear complexity (for instance, fixed points of primitive
substitutions form a subclass of infinite words with sublinear complexity); therefore, the number of
different frequencies of factors of the same length is bounded.

In our previous paper [2], making use of reflection symmetry of Rauzy graphs, we have diminished
Boshernitzan's upper bound for infinite words whose language is closed under reversal.

This time, we generalize our result to infinite words whose language is invariant under all
elements of a group of symmetries and whose Rauzy graphs are therefore invariant under all
elements of a group of automorphisms. In Section 2, we introduce basic notions, describe the
main tool of our proofs (reduced Rauzy graphs) and summarize in detail the known upper
bounds on the number of factor frequencies. Section 3 explains what is to be understood under
a symmetry. In Section 4, we prove Theorem 4.1, which provides an optimal upper bound on
the number of factor frequencies of infinite words whose language is invariant under all elements
of a finite group of symmetries. Section 5 is devoted to the demonstration that the upper bound
from the main theorem is indeed optimal.

Finally, let us mention that the idea to exploit symmetries of the Rauzy graph was already 
used in [3] in order to estimate the number of palindromes of a given length, and, recently, it 
has been used profoundly in [12, 13, 15] for the generalization of the so-called rich and almost 
rich words (see [11]) for languages invariant under more symmetries than just reversal. 

2. Preliminaries 

An alphabet $\mathcal{A}$ is a finite set of symbols, called letters. A concatenation of letters is a word.
The length of a word $w$ is the number of letters in $w$ and is denoted $|w|$. The set $\mathcal{A}^*$ of all
finite words (including the empty word $\varepsilon$) provided with the operation of concatenation is a free
monoid. The set of all finite words but the empty word $\varepsilon$ is denoted $\mathcal{A}^+$. We will also deal with
right-sided infinite words $\mathbf{u} = u_0u_1u_2\ldots$, where $u_i \in \mathcal{A}$. A finite word $w$ is called a factor of the
word $u$ (finite or infinite) if there exist a finite word $p$ and a word $s$ (finite or infinite) such that
$u = pws$. The factor $p$ is a prefix of $u$ and $s$ is a suffix of $u$. An infinite word $\mathbf{u}$ is said to be
recurrent if each of its factors occurs infinitely many times in $\mathbf{u}$. An occurrence of a finite word
$w$ in a finite word $v = v_1v_2 \ldots v_m$ (in an infinite word $\mathbf{u}$) is an index $i$ such that $w$ is a prefix of
the word $v_iv_{i+1} \ldots v_m$ (of the word $u_iu_{i+1} \ldots$). An infinite word $\mathbf{u}$ is called uniformly recurrent
if for any factor $w$ the set $\{j - i \mid i \text{ and } j \text{ are successive occurrences of } w \text{ in } \mathbf{u}\}$ is bounded.

2.1. Complexity and special factors. The language $\mathcal{L}(\mathbf{u})$ of an infinite word $\mathbf{u}$ is the set of
all factors of $\mathbf{u}$. We denote by $\mathcal{L}_n(\mathbf{u})$ the set of factors of length $n$ of $\mathbf{u}$. We define the factor
complexity (or complexity) of $\mathbf{u}$ as the mapping $\mathcal{C} : \mathbb{N} \to \mathbb{N}$ which associates to every $n$ the
number of different factors of length $n$ of $\mathbf{u}$, i.e., $\mathcal{C}(n) = \#\mathcal{L}_n(\mathbf{u})$.
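
The definition can be tried out on a finite prefix. The following Python sketch is an illustration only and not part of the original text: the helper names `iterate_morphism`, `factors` and `complexity` are hypothetical, the Fibonacci substitution 0 -> 01, 1 -> 0 is used as a standard example of a Sturmian word, and it is assumed that a prefix of length 1597 already contains every factor of length at most 5.

```python
def iterate_morphism(rules, seed, steps):
    """Apply the letter-to-word substitution `rules` to `seed` `steps` times."""
    word = seed
    for _ in range(steps):
        word = "".join(rules[letter] for letter in word)
    return word


def factors(word, n):
    """Set of all factors (contiguous subwords) of length n of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def complexity(word, n):
    """Number of distinct factors of length n seen in the finite word."""
    return len(factors(word, n))


# Prefix of the Fibonacci word (assumption: long enough for small n).
FIBONACCI = {"0": "01", "1": "0"}
prefix = iterate_morphism(FIBONACCI, "0", 15)

# Sturmian words are known to satisfy C(n) = n + 1.
values = [complexity(prefix, n) for n in range(6)]
print(values)
```

Counting on a prefix only gives a lower estimate of $\mathcal{C}(n)$ in general; it is exact once the prefix is long enough to contain all factors of the given length.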

An important role for the computation of factor complexity is played by special factors. We
say that a letter $a$ is a right extension of a factor $w \in \mathcal{L}(\mathbf{u})$ if $wa$ is also a factor of $\mathbf{u}$. We denote
by $\mathrm{Rext}(w)$ the set of all right extensions of $w$ in $\mathbf{u}$, i.e., $\mathrm{Rext}(w) = \{a \in \mathcal{A} \mid wa \in \mathcal{L}(\mathbf{u})\}$. If
$\#\mathrm{Rext}(w) \geq 2$, then the factor $w$ is called right special (RS for short). Analogously, we define
left extensions, $\mathrm{Lext}(w)$, left special factors (LS for short). Moreover, we say that a factor $w$ is
bispecial (BS for short) if $w$ is LS and RS.

With these notions in hand, we may introduce a formula for the first difference of complexity
$\Delta\mathcal{C}(n) = \mathcal{C}(n + 1) - \mathcal{C}(n)$ (taken from [7]).

(1) $\Delta\mathcal{C}(n) = \sum_{w \in \mathcal{L}_n(\mathbf{u})} (\#\mathrm{Rext}(w) - 1) = \sum_{w \in \mathcal{L}_n(\mathbf{u})} (\#\mathrm{Lext}(w) - 1), \quad n \in \mathbb{N}.$
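
Formula (1) can be checked numerically on a finite prefix. The sketch below is illustrative and not taken from the paper: `right_extensions` and `fibonacci_prefix` are hypothetical helper names, and the Fibonacci word is assumed as the test word (its prefix of length 1597 is assumed to contain all factors of length at most 6, each with at least one right extension).

```python
def factors(word, n):
    """Set of all factors of length n of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def right_extensions(word, w):
    """Letters a such that the word w + a occurs in the finite word."""
    n = len(w)
    return {f[-1] for f in factors(word, n + 1) if f[:n] == w}


def fibonacci_prefix(steps):
    """Prefix of the Fibonacci word obtained by iterating 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    for _ in range(steps):
        word = "".join(rules[c] for c in word)
    return word


prefix = fibonacci_prefix(15)
for n in range(1, 6):
    lang_n = factors(prefix, n)
    # Right special factors: at least two right extensions.
    right_special = sorted(w for w in lang_n
                           if len(right_extensions(prefix, w)) >= 2)
    delta_c = len(factors(prefix, n + 1)) - len(lang_n)
    total = sum(len(right_extensions(prefix, w)) - 1 for w in lang_n)
    print(n, right_special, delta_c, total)
```

For every listed $n$ the last two printed numbers should agree, which is exactly the first equality in (1).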

2.2. Morphisms and antimorphisms. A mapping $\varphi$ on $\mathcal{A}^*$ is called

• a morphism if $\varphi(vw) = \varphi(v)\varphi(w)$ for any $v, w \in \mathcal{A}^*$,

• an antimorphism if $\varphi(vw) = \varphi(w)\varphi(v)$ for any $v, w \in \mathcal{A}^*$.

We denote the set of all morphisms and antimorphisms on $\mathcal{A}^*$ by $AM(\mathcal{A}^*)$. Together with
composition, it forms a monoid (the unit element is the identity mapping Id). The mirror (also
called reversal) mapping $R$ defined by $R(w_1w_2 \cdots w_{m-1}w_m) = w_mw_{m-1} \cdots w_2w_1$ is an involutive
antimorphism, i.e., $R^2 = \mathrm{Id}$. It is obvious that any antimorphism is a composition of $R$ and
a morphism.

A language $\mathcal{L}(\mathbf{u})$ is closed (invariant) under reversal if for every factor $w \in \mathcal{L}(\mathbf{u})$, also its
mirror image $R(w)$ belongs to $\mathcal{L}(\mathbf{u})$. A factor $w$ which coincides with its mirror image $R(w)$ is
called a palindrome. More generally, a language $\mathcal{L}(\mathbf{u})$ is closed (invariant) under an antimorphism
or morphism $\Psi \in AM(\mathcal{A}^*)$ if for every factor $w \in \mathcal{L}(\mathbf{u})$, also $\Psi(w)$ belongs to $\mathcal{L}(\mathbf{u})$. If $\theta$
is an antimorphism on $\mathcal{A}^*$, then $w = \theta(w)$ is called a $\theta$-palindrome. It is not difficult to see that
an infinite word whose language is closed under an antimorphism of finite order is recurrent.

We define the $\theta$-palindromic complexity of the infinite word $\mathbf{u}$ as the mapping $\mathcal{P}_\theta : \mathbb{N} \to \mathbb{N}$
satisfying $\mathcal{P}_\theta(n) = \#\{w \in \mathcal{L}_n(\mathbf{u}) \mid w = \theta(w)\}$. If $\theta = R$, we write $\mathcal{P}(n)$ instead of $\mathcal{P}_R(n)$.
Clearly, $\mathcal{P}(n) \leq \mathcal{C}(n)$ for all $n \in \mathbb{N}$. A non-trivial inequality between $\mathcal{P}(n)$ and $\mathcal{C}(n)$ can be
found in [1]. Here, we use a result from [3].

Theorem 2.1. If the language of an infinite word is closed under reversal, then for all $n \in \mathbb{N}$,
we have

(2) $\mathcal{P}(n) + \mathcal{P}(n + 1) \leq \Delta\mathcal{C}(n) + 2.$
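
Inequality (2) can be observed on a concrete word. The following sketch is an illustration under stated assumptions, not the authors' computation: it uses the Fibonacci word (whose language is closed under reversal), a prefix assumed long enough to contain all factors of length at most 7, and hypothetical helper names.

```python
def factors(word, n):
    """Set of all factors of length n of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def palindromic_complexity(word, n):
    """Number of factors of length n that equal their mirror image."""
    return sum(1 for w in factors(word, n) if w == w[::-1])


def fibonacci_prefix(steps):
    """Prefix of the Fibonacci word obtained by iterating 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    for _ in range(steps):
        word = "".join(rules[c] for c in word)
    return word


prefix = fibonacci_prefix(15)
for n in range(6):
    delta_c = len(factors(prefix, n + 1)) - len(factors(prefix, n))
    lhs = palindromic_complexity(prefix, n) + palindromic_complexity(prefix, n + 1)
    # Left-hand side of (2) next to its right-hand side.
    print(n, lhs, delta_c + 2)
```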

This result has been recently generalized in [13]. 

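Theorem 2.2. Let $G \subset AM(\mathcal{A}^*)$ be a finite group containing an antimorphism and let $\mathbf{u}$ be
an infinite word whose language is invariant under all elements of $G$. If there exists an integer
$N \in \mathbb{N}$ such that any factor of $\mathbf{u}$ of length $N$ contains all letters of $\mathcal{A}$, then

$$\sum_{\theta \in G^{(2)}} \bigl(\mathcal{P}_\theta(n) + \mathcal{P}_\theta(n + 1)\bigr) \leq \Delta\mathcal{C}(n) + \#G \quad \text{for all } n \geq N,$$

where $G^{(2)}$ is the set of involutive antimorphisms in $G$.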

Remark 2.3. Using Remark 23 from [13], the assumption on $N$ in Theorem 2.2 can be replaced
with the following weaker assumption: there exists an integer $N$ such that

(1) for any two antimorphisms $\theta_1, \theta_2 \in G$, it holds

$\theta_1 \neq \theta_2 \Rightarrow \theta_1(v) \neq \theta_2(v)$ for any $v$ with $|v| \geq N$,

(2) and for any two morphisms $\varphi_1, \varphi_2 \in G$, it holds

$\varphi_1 \neq \varphi_2 \Rightarrow \varphi_1(v) \neq \varphi_2(v)$ for any $v$ with $|v| \geq N$.

If $\mathbf{u}$ is an infinite word whose language is closed under reversal, i.e., invariant under the morphism
and the antimorphism of $G = \{\mathrm{Id}, R\}$, then the above weaker assumption is satisfied already for
$N = 0$. Therefore, Theorem 2.1 is indeed a particular case of Theorem 2.2.

2.3. Factor frequency. If $w$ is a factor of an infinite word $\mathbf{u}$ and if the following limit exists

$$\lim_{|v| \to \infty,\ v \in \mathcal{L}(\mathbf{u})} \frac{\#\{\text{occurrences of } w \text{ in } v\}}{|v|},$$

then it is denoted by $\rho(w)$ and called the frequency of $w$.
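
As an illustration of the definition, the limit can be approximated by counting occurrences in a long factor. The sketch below is not part of the original text; it assumes the morphism (13) from Section 5 as the test case, the helper names are hypothetical, and the comparison value $(\sqrt{3} - 1)/2$ is the letter frequency $\rho(0)$ derived in Section 5.

```python
import math


def count_occurrences(w, v):
    """Number of (possibly overlapping) occurrences of w in the finite word v."""
    return sum(1 for i in range(len(v) - len(w) + 1) if v.startswith(w, i))


def empirical_frequency(w, v):
    """Occurrences of w in v divided by |v|: a finite approximation of rho(w)."""
    return count_occurrences(w, v) / len(v)


# Morphism (13) of Section 5; the prefix phi^8(0) serves as the long factor v.
PHI = {"0": "0130", "1": "1021", "2": "102", "3": "013"}
word = "0"
for _ in range(8):
    word = "".join(PHI[c] for c in word)

estimate = empirical_frequency("0", word)
print(len(word), estimate, (math.sqrt(3) - 1) / 2)
```

The estimate is only an approximation: the definition takes the limit over all factors $v$, whereas the sketch evaluates a single long prefix.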

Let us recall a result of Frid [10], which is useful for the calculation of factor frequencies in
fixed points of primitive morphisms. In order to introduce the result, we need some further
notions. Let $\varphi$ be a morphism on $\mathcal{A}^* = \{a_1, a_2, \ldots, a_m\}^*$. We associate with $\varphi$ the incidence
matrix $M_\varphi$ given by $[M_\varphi]_{ij} = |\varphi(a_j)|_{a_i}$, where $|\varphi(a_j)|_{a_i}$ denotes the number of occurrences of $a_i$
in $\varphi(a_j)$. The morphism $\varphi$ is called primitive if there exists $k \in \mathbb{N}$ such that the power $M_\varphi^k$
has all entries strictly positive. As shown in [14], for fixed points of primitive morphisms,

• factor frequencies exist,

• it follows from the Perron-Frobenius theorem that the incidence matrix has one dominant
eigenvalue $\lambda$, which is larger than the modulus of any other eigenvalue,

• the components of the unique eigenvector $(x_1, x_2, \ldots, x_m)^T$ corresponding to $\lambda$ normalized
so that $\sum_{i=1}^{m} x_i = 1$ coincide with the letter frequencies, i.e., $x_i = \rho(a_i)$ for all
$i \in \{1, 2, \ldots, m\}$.
Let $\varphi$ be a morphism on $\mathcal{A}^*$. We denote $\psi_{ij} : \mathcal{A}^+ \to \mathcal{A}^+$, where $i, j \in \mathbb{N}$, the mapping that
associates with $v \in \mathcal{A}^+$ the word $\psi_{ij}(v)$ obtained from $\varphi(v)$ by erasing $i$ letters from the left
and $j$ letters from the right, where $i + j < |\varphi(v)|$. We say that a word $v \in \mathcal{A}^+$ admits an
interpretation $s = (b_0b_1 \ldots b_m, i, j)$ if $v = \psi_{ij}(b_0b_1 \ldots b_m)$, where $b_i \in \mathcal{A}$ and $i < |\varphi(b_0)|$ and
$j < |\varphi(b_m)|$. The word $a(s) = b_0b_1 \ldots b_m$ is an ancestor of $s$. The set of all interpretations of $v$
is denoted $I(v)$. Now we can recall the promised result of Frid [10].

Proposition 2.4. Let $\varphi$ be a primitive morphism having a fixed point $\mathbf{u}$ and let $\lambda$ be the dominant
eigenvalue of the incidence matrix $M_\varphi$. Then for any factor $v \in \mathcal{L}(\mathbf{u})$, it holds

$$\rho(v) = \frac{1}{\lambda} \sum_{s \in I(v)} \rho(a(s)).$$

2.4. Reduced Rauzy graphs. Assume throughout this section that factor frequencies of the
infinite words in question exist. The Rauzy graph of order $n$ of an infinite word $\mathbf{u}$ is a directed
graph $\Gamma_n$ whose set of vertices is $\mathcal{L}_n(\mathbf{u})$ and set of edges is $\mathcal{L}_{n+1}(\mathbf{u})$. An edge $e = w_0w_1 \ldots w_n$
starts in the vertex $w = w_0w_1 \ldots w_{n-1}$, ends in the vertex $v = w_1 \ldots w_{n-1}w_n$, and is labeled by
its factor frequency $\rho(e)$.

It is easy to see that edge frequencies in a Rauzy graph $\Gamma_n$ behave similarly to the current in
a circuit. We may formulate an analogy of Kirchhoff's current law: the sum of frequencies of
edges ending in a vertex equals the sum of frequencies of edges starting in this vertex.

Observation 2.5 (Kirchhoff's law for frequencies). Let $w$ be a factor of an infinite word $\mathbf{u}$
whose factor frequencies exist. Then

$$\rho(w) = \sum_{a \in \mathrm{Lext}(w)} \rho(aw) = \sum_{a \in \mathrm{Rext}(w)} \rho(wa).$$
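
On a finite word the same balance holds for occurrence counts up to a boundary effect of at most one occurrence per vertex (the first and the last window of the word). The sketch below illustrates this; it is an assumption-laden illustration rather than the paper's construction: the function names are hypothetical and the prefix $\varphi^6(0)$ of the fixed point of the morphism (13) from Section 5 is used as test data.

```python
from collections import Counter


def rauzy_graph(word, n):
    """Occurrence counts of vertices (length n) and edges (length n + 1)."""
    vertices = Counter(word[i:i + n] for i in range(len(word) - n + 1))
    edges = Counter(word[i:i + n + 1] for i in range(len(word) - n))
    return vertices, edges


def kirchhoff_defect(word, n):
    """Largest gap between incoming and outgoing edge counts over all vertices."""
    vertices, edges = rauzy_graph(word, n)
    incoming, outgoing = Counter(), Counter()
    for edge, count in edges.items():
        outgoing[edge[:n]] += count   # the edge starts in its prefix of length n
        incoming[edge[1:]] += count   # the edge ends in its suffix of length n
    return max(abs(incoming[w] - outgoing[w]) for w in vertices)


# Test data: a prefix of the fixed point of the morphism (13) from Section 5.
PHI = {"0": "0130", "1": "1021", "2": "102", "3": "013"}
word = "0"
for _ in range(6):
    word = "".join(PHI[c] for c in word)

for n in range(1, 5):
    print(n, kirchhoff_defect(word, n))
```

Dividing the counts by the length of the word and letting the length grow removes the boundary effect, which is the content of Observation 2.5.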

Kirchhoff's law for frequencies has some useful consequences. 

Corollary 2.6. Let $w$ be a factor of an infinite word $\mathbf{u}$ whose frequency exists.

• If $w$ has a unique right extension $a$, then $\rho(w) = \rho(wa)$.

• If $w$ has a unique left extension $a$, then $\rho(w) = \rho(aw)$.

Corollary 2.7. Let $w$ be a factor of an aperiodic recurrent infinite word $\mathbf{u}$ whose frequency
exists. Let $v$ be the shortest BS factor containing $w$, then $\rho(w) = \rho(v)$.

The assumption of recurrence and aperiodicity in Corollary 2.7 is needed in order to ensure
that every factor can be extended to a BS factor.

Corollary 2.6 implies that if a Rauzy graph contains a vertex $w$ with only one incoming edge
$aw$ and one outgoing edge $wb$, then $\rho := \rho(aw) = \rho(w) = \rho(wb) = \rho(awb)$. Therefore, we can
replace this triplet (edge-vertex-edge) with only one edge $awb$ keeping the frequency $\rho$. If we
reduce the Rauzy graph step by step applying the above described procedure, we obtain the
so-called reduced Rauzy graph $\tilde{\Gamma}_n$, which simplifies the investigation of edge frequencies. In order
to make this construction precise, we introduce the notion of a simple path.

Definition 2.8. Let $\Gamma_n$ be the Rauzy graph of order $n$ of an infinite word $\mathbf{u}$. A factor $e$ of
length larger than $n$ such that its prefix and its suffix of length $n$ are special factors and $e$ does
not contain any other special factors is called a simple path. We define the label of a simple path
$e$ as $\rho(e)$.

Definition 2.9. The reduced Rauzy graph $\tilde{\Gamma}_n$ of $\mathbf{u}$ of order $n$ is a directed graph whose set of
vertices is formed by LS and RS factors of $\mathcal{L}_n(\mathbf{u})$ and whose set of edges is given in the following
way. Vertices $w$ and $v$ are connected with an edge $e$ if there exists in $\Gamma_n$ a simple path starting
in $w$ and ending in $v$. We assign to such an edge $e$ the label of the corresponding simple path.

For a recurrent word $\mathbf{u}$, at least one edge starts and at least one edge ends in every vertex of
$\Gamma_n$. If $\mathbf{u}$ is moreover aperiodic, then all its Rauzy graphs contain at least one LS and one RS
factor. It is thus not difficult to see that for recurrent aperiodic words, the set of edge labels in
$\Gamma_n$ is equal to the set of edge labels in the reduced Rauzy graph $\tilde{\Gamma}_n$. The number of edge labels
in the reduced Rauzy graph $\tilde{\Gamma}_n$ is clearly less than or equal to the number of edges in $\tilde{\Gamma}_n$. Let us calculate the
number of edges in $\tilde{\Gamma}_n$ in order to get an upper bound on the number of frequencies of factors
in $\mathcal{L}_{n+1}(\mathbf{u})$.

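For every RS factor $w \in \mathcal{L}_n(\mathbf{u})$, it holds that $\#\mathrm{Rext}(w)$ edges begin in $w$ and for every LS
factor $v \in \mathcal{L}_n(\mathbf{u})$ which is not RS, only one edge begins in $v$, thus we get the following formula

(3) $\#\{e \mid e \text{ edge in } \tilde{\Gamma}_n\} = \sum_{w \text{ RS in } \mathcal{L}_n(\mathbf{u})} \#\mathrm{Rext}(w) + \sum_{v \text{ LS not RS in } \mathcal{L}_n(\mathbf{u})} 1.$

We rewrite the first term using (1) and the second term using the definition of BS factors in the
following way

(4) $\#\{e \mid e \text{ edge in } \tilde{\Gamma}_n\} = \Delta\mathcal{C}(n) + \sum_{v \text{ RS in } \mathcal{L}_n(\mathbf{u})} 1 + \sum_{v \text{ LS in } \mathcal{L}_n(\mathbf{u})} 1 - \sum_{v \text{ BS in } \mathcal{L}_n(\mathbf{u})} 1.$

Since $\#\mathrm{Rext}(w) - 1 \geq 1$ for any RS factor $w$ and, similarly, for LS factors, we have

(5) $\#\{w \in \mathcal{L}_n(\mathbf{u}) \mid w \text{ RS}\} \leq \Delta\mathcal{C}(n)$ and $\#\{w \in \mathcal{L}_n(\mathbf{u}) \mid w \text{ LS}\} \leq \Delta\mathcal{C}(n).$

By combining (4) and (5), we obtain

(6) $\#\{e \mid e \text{ edge in } \tilde{\Gamma}_n\} \leq 3\Delta\mathcal{C}(n) - X,$

where $X$ is the number of BS factors of length $n$. This provides us with the result initially
proved by Boshernitzan in [5].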

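Theorem 2.10. Let $\mathbf{u}$ be an aperiodic recurrent infinite word such that the frequency $\rho(w)$ exists
for every factor $w \in \mathcal{L}(\mathbf{u})$. Then for every $n \in \mathbb{N}$, it holds

$$\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} \leq 3\Delta\mathcal{C}(n).$$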

In the paper [2], we have considered infinite words with language closed under reversal and
we have lowered the upper bound from Theorem 2.10 for them.

Theorem 2.11. Let $\mathbf{u}$ be an infinite word whose language $\mathcal{L}(\mathbf{u})$ is closed under reversal and
such that the frequency $\rho(w)$ exists for every factor $w \in \mathcal{L}(\mathbf{u})$. Then for every $n \in \mathbb{N}$, we have

(7) $\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} \leq 2\Delta\mathcal{C}(n) + 1 - \frac{1}{2}X - \frac{1}{2}Y,$

where $X$ is the number of BS factors of length $n$ and $Y$ is the number of palindromic BS factors
of length $n$.

Corollary 2.12. Let $\mathbf{u}$ be an infinite word whose language $\mathcal{L}(\mathbf{u})$ is closed under reversal and
such that the frequency $\rho(w)$ exists for every factor $w \in \mathcal{L}(\mathbf{u})$. Then the number of distinct
factor frequencies obeys for all $n \in \mathbb{N}$,

(8) $\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} \leq 2\Delta\mathcal{C}(n) + 1,$

where the equality is reached if and only if $\mathbf{u}$ is purely periodic.

As shown by Ferenczi and Zamboni [9], $m$-iet words attain the upper bound from (7) for
all $n \in \mathbb{N}$. Since Sturmian words are 2-iet words, they reach the upper bound in (7) for all
$n \in \mathbb{N}$, too. Consequently, the upper bound from (7) is optimal and cannot be improved while
preserving the assumptions. However, as we will show in the sequel, if the language of an
infinite word $\mathbf{u}$ is invariant under more symmetries, the upper bound from (7) may be lowered
considerably.

3. Symmetries preserving factor frequency 

We will be interested in symmetries preserving in a certain way factor occurrences in $\mathbf{u}$, and
consequently, frequencies of factors of $\mathbf{u}$. Let us call a symmetry on $\mathcal{A}^*$ any mapping $\Psi$ satisfying
the following two properties:

(1) $\Psi$ is a bijection: $\mathcal{A}^* \to \mathcal{A}^*$,

(2) for all $w, v \in \mathcal{A}^*$

$\#\{\text{occurrences of } w \text{ in } v\} = \#\{\text{occurrences of } \Psi(w) \text{ in } \Psi(v)\}.$

Theorem 3.1. Let $\Psi : \mathcal{A}^* \to \mathcal{A}^*$. Then $\Psi$ is a symmetry if and only if $\Psi$ is a morphism or an
antimorphism such that $\Psi$ is a letter permutation when restricted to $\mathcal{A}$.
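
Property (2) is easy to test on examples. The following sketch is illustrative only: `make_antimorphism` and `make_morphism` are hypothetical helper names, the letter permutation swapping 0 and 1 is an arbitrary choice, and the Fibonacci substitution serves as an assumed example of a morphism that is not a letter permutation and therefore fails Property (2).

```python
def count_occurrences(w, v):
    """Number of (possibly overlapping) occurrences of w in the finite word v."""
    return sum(1 for i in range(len(v) - len(w) + 1) if v.startswith(w, i))


def make_morphism(images):
    """Morphism determined by the images of letters."""
    return lambda word: "".join(images[c] for c in word)


def make_antimorphism(images):
    """Antimorphism determined by the images of letters (order is reversed)."""
    return lambda word: "".join(images[c] for c in reversed(word))


# A letter-permuting antimorphism on {0, 1, 2, 3}: swap 0 and 1, then mirror.
theta = make_antimorphism({"0": "1", "1": "0", "2": "2", "3": "3"})
v = "013010210130130"
preserved = all(
    count_occurrences(v[i:i + n], v) == count_occurrences(theta(v[i:i + n]), theta(v))
    for n in range(1, 5)
    for i in range(len(v) - n + 1)
)
print(preserved)

# A morphism that does not permute letters violates Property (2).
fib = make_morphism({"0": "01", "1": "0"})
print(count_occurrences("1", "0"), count_occurrences(fib("1"), fib("0")))
```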

The proof of Theorem 3.1 is obtained when putting together the following two lemmas.

Lemma 3.2. Let $\Psi$ be a symmetry on $\mathcal{A}^*$ and let $w \in \mathcal{A}^*$. Then $|\Psi(w)| = |w|$.

Proof. Since $\#\{\text{occurrences of } \Psi(w) \text{ in } \Psi(\varepsilon)\} = \#\{\text{occurrences of } w \text{ in } \varepsilon\} = 0$ for every nonempty
$w \in \mathcal{A}^*$, it follows that $\Psi(\varepsilon) = \varepsilon$.

Since $\Psi$ is a bijection, for every letter $a \in \mathcal{A}$, there exists a unique $w \in \mathcal{A}^*$ such that $\Psi(w) = a$,
where $w \neq \varepsilon$. If we denote $\mathcal{A} = \{a_1, \ldots, a_m\}$, then using Property (2), it is easy to show that
there exists a permutation $\pi \in S_m$ such that $\Psi(a_k) = a_{\pi(k)}$ for all $k \in \{1, \ldots, m\}$.

Let us now take an arbitrary $w \in \mathcal{A}^*$, then using the fact that $\Psi$ restricted to $\mathcal{A}$ is a letter
permutation and applying Property (2), we have

$$|w| = \sum_{a \in \mathcal{A}} \#\{\text{occurrences of } a \text{ in } w\} = \sum_{a \in \mathcal{A}} \#\{\text{occurrences of } \Psi(a) \text{ in } \Psi(w)\} = |\Psi(w)|.$$

□

Using Lemma 3.2 and the definition of symmetry, it is seen for every $w_1w_2 \cdots w_n \in \mathcal{A}^*$,
$w_i \in \mathcal{A}$, that the following equation is valid

(9) $\Psi(w_1w_2 \ldots w_n) = \Psi(w_{\sigma(1)})\Psi(w_{\sigma(2)}) \cdots \Psi(w_{\sigma(n)})$

for some permutation $\sigma \in S_n$. The next lemma claims that the permutation $\sigma$ is necessarily
either the identical permutation $(1\,2 \ldots n)$ or the symmetric permutation $(n \ldots 2\,1)$.

Lemma 3.3. Let $\Psi$ be a symmetry on $\mathcal{A}^*$. Then $\Psi$ is either a morphism or an antimorphism.

Proof. We have to prove that $\Psi(w) = \Psi(w_1)\Psi(w_2) \cdots \Psi(w_n)$ for every $w = w_1w_2 \cdots w_n \in \mathcal{A}^*$,
$w_i \in \mathcal{A}$, or $\Psi(w) = \Psi(w_n) \cdots \Psi(w_2)\Psi(w_1)$ for every $w = w_1w_2 \ldots w_n \in \mathcal{A}^*$, $w_i \in \mathcal{A}$.

Let us proceed by induction on the length $n$ of $w$. The case $n = 1$ is clear. Suppose that
$\Psi(w) = \Psi(w_1)\Psi(w_2) \ldots \Psi(w_{n-1})$ for every $w = w_1w_2 \cdots w_{n-1} \in \mathcal{A}^*$ of length $n - 1$, $n \geq 2$.
Take an arbitrary word $w = w_1w_2 \cdots w_n \in \mathcal{A}^*$. Then, as $\Psi$ is a symmetry, $\Psi(w_2 \cdots w_n)$ is
a factor of $\Psi(w_1w_2 \cdots w_n)$, in more precise terms, $\Psi(w_2 \cdots w_n)$ is either a prefix or a suffix of
$\Psi(w_1 \ldots w_n)$. Moreover, if $w_1$ occurs in $w_2 \ldots w_n$ $\ell$ times, $w_1$ occurs in $w_1w_2 \cdots w_n$ $(\ell + 1)$ times.
Since $\Psi$ is a symmetry, it follows that $\Psi(w_1)$ occurs $\ell$ times in $\Psi(w_2 \cdots w_n)$ and $(\ell + 1)$ times in
$\Psi(w_1w_2 \ldots w_n)$. These two observations result in

$$\Psi(w_1w_2 \ldots w_n) = \Psi(w_1)\Psi(w_2 \cdots w_n) = \Psi(w_1)\Psi(w_2) \cdots \Psi(w_n)$$

or

$$\Psi(w_1w_2 \ldots w_n) = \Psi(w_2 \cdots w_n)\Psi(w_1) = \Psi(w_2) \cdots \Psi(w_n)\Psi(w_1).$$

The first case means that $\Psi$ is a morphism. Let us treat the second case. Similar reasoning as
before leads to

$$\Psi(w_1w_2 \ldots w_n) = \Psi(w_1 \ldots w_{n-1})\Psi(w_n) = \Psi(w_1)\Psi(w_2) \cdots \Psi(w_n)$$

or

$$\Psi(w_1w_2 \ldots w_n) = \Psi(w_n)\Psi(w_1 \ldots w_{n-1}) = \Psi(w_n)\Psi(w_1) \ldots \Psi(w_{n-1}).$$

The first case again means that $\Psi$ is a morphism. The only case which remains is $\Psi(w) =
\Psi(w_2) \cdots \Psi(w_n)\Psi(w_1) = \Psi(w_n)\Psi(w_1) \ldots \Psi(w_{n-1})$. Since $\Psi$ is a bijection, we get $w_1 = w_2 =
\cdots = w_n$. Hence, again $\Psi(w) = \Psi(w_1)\Psi(w_2) \cdots \Psi(w_n)$.

With the same reasoning, we deduce that if $\Psi(w) = \Psi(w_{n-1}) \cdots \Psi(w_2)\Psi(w_1)$ for every $w =
w_1w_2 \cdots w_{n-1} \in \mathcal{A}^*$, $n \geq 2$, then for an arbitrary $w = w_1w_2 \cdots w_n \in \mathcal{A}^*$, $w_i \in \mathcal{A}$, we get

$$\Psi(w) = \Psi(w_n) \cdots \Psi(w_2)\Psi(w_1).$$

□

Observation 3.4. Let $\mathbf{u}$ be an infinite word whose language is invariant under a symmetry $\Psi$.
For every $w$ in $\mathcal{L}(\mathbf{u})$ whose frequency exists, it holds

$$\rho(w) = \rho(\Psi(w)).$$

Remark 3.5. If a finite set $G$ is a submonoid of $AM(\mathcal{A}^*)$, then $G$ is a group and each of its members
restricted to the set of words of length one is just a permutation on the alphabet $\mathcal{A}$. In other
words, $G$ is a finite group of symmetries. Words with languages invariant under all elements of
such a group $G$ of symmetries have been studied in [13].

4. Factor frequencies of languages invariant under more symmetries 

Assume $\mathbf{u}$ is an infinite word over an alphabet $\mathcal{A}$ with $\#\mathcal{A} \geq 2$ whose language is invariant
under all elements of a finite group $G \subset AM(\mathcal{A}^*)$ of symmetries containing an antimorphism.
Let us summarize some observations concerning the group $G$ of symmetries and reduced Rauzy
graphs of $\mathbf{u}$. These observations constitute all the tools we need for the proof of the main theorem
of this paper, Theorem 4.1.

Observations:

(1) Let $\theta$ be an antimorphism in $G$. The mapping $\Psi \to \theta\Psi$ is a bijection on $G$ satisfying

$\Psi \in G$ is a morphism $\Leftrightarrow$ $\theta\Psi \in G$ is an antimorphism.

This implies that $G$ containing an antimorphism has an even number of elements, i.e.,
$\#G = 2k$.

(2) For a factor $w$ containing all letters of $\mathcal{A}$, the following properties can be easily verified:

(a) for any distinct antimorphisms $\theta_1, \theta_2 \in G$, we have $\theta_1(w) \neq \theta_2(w)$,

(b) for any distinct morphisms $\varphi_1, \varphi_2 \in G$, we have $\varphi_1(w) \neq \varphi_2(w)$.

(3) If $w$ is a $\theta$-palindrome containing all letters of $\mathcal{A}$ for an antimorphism $\theta \in G$, then $\theta$ is
an involution, i.e., $\theta^2 = \mathrm{Id}$.

(4) In a reduced Rauzy graph of $\mathbf{u}$, if there is an edge $e$ between two vertices $w$ and $v$, where
$w$ and $v$ contain all letters of $\mathcal{A}$, then

(a) either $e$ is a $\theta$-palindrome for some antimorphism $\theta \in G$, then there exist at least $k$
distinct edges having the same label $\rho(e)$, namely edges $\varphi(e)$ for all morphisms $\varphi$ in
$G$;

(b) or $e$ is not a $\theta$-palindrome for any antimorphism $\theta \in G$, then there exist at least $2k$
distinct edges having the same label $\rho(e)$, namely edges $\varphi(e)$ for all morphisms $\varphi$ in
$G$ and $\theta(e)$ for all antimorphisms $\theta$ in $G$.

(5) On one hand, if an edge $e$ in the reduced Rauzy graph $\tilde{\Gamma}_n$ is mapped by $\theta$ onto itself,
then the corresponding simple path has a $\theta$-palindromic central factor of length $n$ or
$n + 1$. On the other hand, every $\theta$-palindrome contained in $\mathcal{L}_{n+1}(\mathbf{u})$ is the central factor
of a simple path mapped by $\theta$ onto itself and every $\theta$-palindrome of length $n$ is either
the central factor of a simple path mapped by $\theta$ onto itself or is a special factor (thus,
evidently, a BS factor).

Theorem 4.1. Let $G \subset AM(\mathcal{A}^*)$ be a finite group containing an antimorphism and let $\mathbf{u}$ be
a uniformly recurrent aperiodic infinite word whose language is invariant under all elements of
$G$ and such that the frequency $\rho(w)$ exists for every factor $w \in \mathcal{L}(\mathbf{u})$. Then there exists $N \in \mathbb{N}$
such that

$$\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} \leq \frac{1}{\#G}\bigl(4\Delta\mathcal{C}(n) + \#G - X - Y\bigr) \quad \text{for all } n \geq N,$$

where $X$ is the number of BS factors of length $n$ and $Y$ is the number of BS factors of length $n$
that are $\theta$-palindromes for an antimorphism $\theta \in G$.

Proof. Since $\mathbf{u}$ is uniformly recurrent, we can find $N$ such that any factor of length $N$ contains
all letters of $\mathbf{u}$. Let $\tilde{\Gamma}_n$ be the reduced Rauzy graph of $\mathbf{u}$ of order $n \geq N$. We know already that
the set of edge labels of $\tilde{\Gamma}_n$ is equal to the set of edge labels of $\Gamma_n$. It is easy to see that any
element of $G$ is an automorphism of $\tilde{\Gamma}_n$, i.e., $G$ maps the graph $\tilde{\Gamma}_n$ onto itself.

Let us denote by $A$ the number of edges $e$ in $\tilde{\Gamma}_n$ such that $e$ is mapped by a certain antimorphism
of $G$ onto itself (such an antimorphism is involutive by Observation (3)) and by $B$ the
number of edges $e$ in $\tilde{\Gamma}_n$ such that $e$ is not mapped by any antimorphism of $G$ onto itself, then

(10) $\#\{e \mid e \text{ edge in } \tilde{\Gamma}_n\} = A + B \leq 3\Delta\mathcal{C}(n) - X,$

where the upper bound is taken from (6). We get, using Observations (3) and (5), the following
formula

(11) $A = \sum_{\theta \in G^{(2)}} \bigl(\mathcal{P}_\theta(n) + \mathcal{P}_\theta(n + 1)\bigr) - \sum_{\theta \in G^{(2)}} \#\{w \in \mathcal{L}_n(\mathbf{u}) \mid w = \theta(w) \text{ and } w \text{ BS}\},$

where we subtract the number of BS factors of $\mathcal{L}_n(\mathbf{u})$ that are $\theta$-palindromes for a certain
antimorphism $\theta$, in the statement denoted by $Y$, since they are not central factors of any simple
path. If $\#G = 2k$, then for every edge $e$ in $\tilde{\Gamma}_n$ that is mapped by a certain antimorphism $\theta \in G$
onto itself, there are at least $k$ different edges with the same label $\rho(e)$ by Observation (4a).

Now, let us turn our attention to those edges of $\tilde{\Gamma}_n$ which are not mapped by any antimorphism
of $G$ onto themselves. For every such edge $e$, at least $2k$ edges have the same label $\rho(e)$ by
Observation (4b). These considerations lead to the following estimate

(12) $\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} \leq \frac{1}{k}A + \frac{1}{2k}B = \frac{1}{2k}A + \frac{1}{2k}(A + B).$

Putting together (11), (10), (12), and Theorem 2.2, the statement is proven. □

Remark 4.2. If the language of an infinite word $\mathbf{u}$ is closed under reversal, then $G = \{\mathrm{Id}, R\}$ and the new
upper bound from Theorem 4.1 coincides with the estimate from Theorem 2.11.
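
The coincidence of the two bounds for $\#G = 2$ can be checked by direct arithmetic. The sketch below is an illustration with hypothetical function names; it evaluates the right-hand sides of Theorem 2.10, of (7) and of Theorem 4.1 with exact rational arithmetic, and reproduces the values $3$ and $2$ that appear later in the proof of Proposition 5.5 for $\Delta\mathcal{C}(n) = 2$ and $\#G = 4$.

```python
from fractions import Fraction


def bound_boshernitzan(delta_c):
    """Right-hand side of Theorem 2.10: 3 * dC(n)."""
    return 3 * delta_c


def bound_reversal(delta_c, x, y):
    """Right-hand side of (7): 2 * dC(n) + 1 - X/2 - Y/2."""
    return 2 * delta_c + 1 - Fraction(x, 2) - Fraction(y, 2)


def bound_group(delta_c, group_order, x, y):
    """Right-hand side of Theorem 4.1: (4 * dC(n) + #G - X - Y) / #G."""
    return Fraction(4 * delta_c + group_order - x - y, group_order)


# For #G = 2 the bound of Theorem 4.1 reduces to the bound (7).
print(bound_group(2, 2, 1, 1), bound_reversal(2, 1, 1))

# Values used in Section 5: dC(n) = 2 and #G = 4.
print(bound_group(2, 4, 0, 0), bound_group(2, 4, 2, 2), bound_boshernitzan(2))
```
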
Remark 4.3. It is easy to show that Theorem 4.1 will stay true if we replace the assumption of
uniform recurrence with the weaker (however more technical) assumption from Remark 2.3.

Finally, if we want to have a simpler upper bound on factor frequencies, we can use the
following one, which is slightly rougher than the estimate from Theorem 4.1.

Corollary 4.4. Let $G \subset AM(\mathcal{A}^*)$ be a finite group containing an antimorphism and let $\mathbf{u}$ be
a uniformly recurrent infinite word whose language is invariant under all elements of $G$ and
such that the frequency $\rho(w)$ exists for every factor $w \in \mathcal{L}(\mathbf{u})$. Then there exists $N \in \mathbb{N}$ such
that

$$\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} \leq \frac{4}{\#G}\Delta\mathcal{C}(n) + 1 \quad \text{for all } n \geq N.$$

The equality holds for all sufficiently large $n$ if and only if $\mathbf{u}$ is purely periodic.

5. Optimality of the upper bound

In this section, we will illustrate on an example taken from [13] that the upper bound from
Theorem 4.1 is attained for every $n \in \mathbb{N}$, $n \geq 1$, thus it is an optimal upper bound. The
infinite word $\mathbf{u}$ in question is the fixed point starting in 0, which is obtained when we iterate
the primitive morphism $\varphi$ given by:

(13) $\varphi(0) = 0130, \quad \varphi(1) = 1021, \quad \varphi(2) = 102, \quad \varphi(3) = 013,$

i.e., for all $n \in \mathbb{N}$, the word $\varphi^n(0)$ is a prefix of $\mathbf{u}$.
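
The iteration of (13) is straightforward to reproduce. The following sketch is illustrative (the function names `apply_morphism` and `iterate` are hypothetical); it prints the first iterates and checks that each iterate is a prefix of the next one, which is what makes the fixed point well defined.

```python
PHI = {"0": "0130", "1": "1021", "2": "102", "3": "013"}


def apply_morphism(rules, word):
    """Replace every letter of `word` by its image under `rules`."""
    return "".join(rules[letter] for letter in word)


def iterate(rules, seed, steps):
    """Compute rules^steps(seed)."""
    word = seed
    for _ in range(steps):
        word = apply_morphism(rules, word)
    return word


for n in range(4):
    print(n, iterate(PHI, "0", n))

# Each iterate is a prefix of the next one because phi(0) starts with 0.
prefix_chain = all(
    iterate(PHI, "0", n + 1).startswith(iterate(PHI, "0", n)) for n in range(6)
)
print(prefix_chain)
```
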
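The corresponding incidence matrix is of the form

$$M_\varphi = \begin{pmatrix} 2 & 1 & 1 & 1 \\ 1 & 2 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix},$$

its dominant eigenvalue is $\lambda = 2 + \sqrt{3}$ with the corresponding normalized eigenvector

$$\frac{1}{2}\begin{pmatrix} \sqrt{3} - 1 \\ \sqrt{3} - 1 \\ 2 - \sqrt{3} \\ 2 - \sqrt{3} \end{pmatrix},$$

hence we get the letter frequencies

$$\rho(0) = \rho(1) = \frac{\sqrt{3} - 1}{2}, \qquad \rho(2) = \rho(3) = 1 - \frac{\sqrt{3}}{2}.$$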
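
The eigenvalue and the letter frequencies can be confirmed numerically. The sketch below is an illustration, not the authors' computation: it builds the incidence matrix from (13) with the convention $[M_\varphi]_{ij} = |\varphi(a_j)|_{a_i}$ and uses plain power iteration (a technique chosen here for simplicity, with hypothetical function names) to approximate the dominant eigenpair.

```python
import math

PHI = {"0": "0130", "1": "1021", "2": "102", "3": "013"}
ALPHABET = "0123"


def incidence_matrix(rules, alphabet):
    """Entry [i][j] is the number of occurrences of alphabet[i] in rules[alphabet[j]]."""
    return [[rules[b].count(a) for b in alphabet] for a in alphabet]


def dominant_eigenpair(matrix, iterations=200):
    """Power iteration; the vector is kept normalized so that its entries sum to 1."""
    size = len(matrix)
    x = [1.0 / size] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        y = [sum(matrix[i][j] * x[j] for j in range(size)) for i in range(size)]
        eigenvalue = sum(y)          # equals lambda at convergence since sum(x) == 1
        x = [entry / eigenvalue for entry in y]
    return eigenvalue, x


M = incidence_matrix(PHI, ALPHABET)
lam, freq = dominant_eigenpair(M)
print(M)
print(lam, 2 + math.sqrt(3))
print(freq, (math.sqrt(3) - 1) / 2, 1 - math.sqrt(3) / 2)
```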

We also know that the frequencies of all factors exist because of the primitivity of $\varphi$. In [13],
the following properties of $\mathbf{u}$ have been shown:

(1) The language $\mathcal{L}(\mathbf{u})$ is closed under the finite group of symmetries $G = \{\mathrm{Id}, \theta_1, \theta_2, \theta_1\theta_2\}$,
where $\theta_1, \theta_2$ are involutive antimorphisms acting on $\mathcal{A}$ as follows:

$\theta_1 : 0 \to 1,\ 1 \to 0,\ 2 \to 2,\ 3 \to 3$ and $\theta_2 : 0 \to 0,\ 1 \to 1,\ 2 \to 3,\ 3 \to 2.$

(2) The first increment of factor complexity satisfies $\Delta\mathcal{C}(n) = 2$ for all $n \in \mathbb{N}$, $n \geq 1$.
Moreover, every LS factor $w$ is a prefix, for some $n \in \mathbb{N}$,

• of either $\varphi^n(0) = 013010210130130 \ldots$ and $\mathrm{Lext}(w) = \{1, 3\}$

• or of $\varphi^n(1) = 102101301021021 \ldots$ and $\mathrm{Lext}(w) = \{0, 2\}$.

(3) A factor $w$ of $\mathbf{u}$ is LS if and only if $\theta_i(w)$ is RS for $i \in \{1, 2\}$.
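
Properties (1) and (2) can be observed on a finite prefix for small lengths. The sketch below is only an illustration under the assumption that the prefix $\varphi^5(0)$ contains all factors of length at most 4; the helper names are hypothetical and the check is no substitute for the proofs in [13].

```python
PHI = {"0": "0130", "1": "1021", "2": "102", "3": "013"}
THETA1 = {"0": "1", "1": "0", "2": "2", "3": "3"}
THETA2 = {"0": "0", "1": "1", "2": "3", "3": "2"}


def iterate(rules, seed, steps):
    """Compute rules^steps(seed)."""
    word = seed
    for _ in range(steps):
        word = "".join(rules[c] for c in word)
    return word


def factors(word, n):
    """Set of all factors of length n of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def antimorphism(letter_map, word):
    """Apply the letter map and reverse the order of letters."""
    return "".join(letter_map[c] for c in reversed(word))


prefix = iterate(PHI, "0", 5)
for n in range(1, 4):
    lang = factors(prefix, n)
    # Closure of the set of factors of length n under theta_1 and theta_2.
    closed = all(antimorphism(t, w) in lang for t in (THETA1, THETA2) for w in lang)
    delta_c = len(factors(prefix, n + 1)) - len(lang)
    print(n, closed, delta_c)
```
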
In order to find the set of frequencies of factors of any length, we need to describe BS factors
of $\mathbf{u}$. By Property (3), we deduce the relation between BS factors and $\theta_i$-palindromes.

Corollary 5.1. Every nonempty BS factor is a $\theta_i$-palindrome for one of the indices $i \in \{1, 2\}$.

Proposition 5.2. If $v \in \mathcal{L}(\mathbf{u})$ is a BS factor of length greater than 5, then $v = \varphi(w)p_{w_n}$, where
$w_n$ is the last letter of $w$ and $p_0 = p_2 = 10210$ and $p_1 = p_3 = 01301$. Moreover, $\rho(v) = \frac{1}{\lambda}\rho(w)$.

Proof. By Property (2), every LS factor of length greater than 5 starts either in 01301 or in 10210.
Similarly, by Property (3), every RS factor ends either in 01301 or in 10210. It follows from the
definition of $\varphi$ in (13) that there exists $w \in \mathcal{L}(\mathbf{u})$ such that $v = \varphi(w)01301$ or $v = \varphi(w)10210$
and that $w$ is necessarily a BS factor. Consider $v = \varphi(w)01301$; the second case can be treated
analogously. It is then not difficult to see that $w$ ends in $w_n = 1$ or $w_n = 3$, hence $p_{w_n} = 01301$.
In order to prove the relation between frequencies, we need to determine the set of interpretations
of $v$. It is readily seen that the set of interpretations is

• $\{(w01, 0, 3), (w02, 0, 2), (w30, 0, 2)\}$ if $w_n = 1$ or $w_n = 3$,

• $\{(w10, 0, 3), (w13, 0, 2), (w21, 0, 2)\}$ if $w_n = 0$ or $w_n = 2$.

Using Proposition 2.4, we obtain $\rho(v) = \frac{\rho(w01) + \rho(w02) + \rho(w30)}{\lambda} = \frac{\rho(w)}{\lambda}$ if $w_n = 1$ or $w_n = 3$, where
the last equality follows from the fact that $w$ is always followed by 01, 02, or 30, and similarly,
$\rho(v) = \frac{\rho(w10) + \rho(w13) + \rho(w21)}{\lambda} = \frac{\rho(w)}{\lambda}$ if $w_n = 0$ or $w_n = 2$. □

Proposition 5.2 implies that if we want to generate all BS factors of $\mathbf{u}$, then it is enough to
know BS factors of length less than or equal to 5 and to apply the mapping $w \to \varphi(w)p_{w_n}$ on
them repeatedly. Nonempty BS factors of length less than or equal to 5 are:

(1) 0 and 1,

(2) 01 and 10,

(3) 01301 and 10210.

The aim of the rest of this section is to show that for any length $n \in \mathbb{N}$, $n \geq 1$, we have

$$\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} = \begin{cases} 2 & \text{if } \mathcal{L}_n(\mathbf{u}) \text{ contains a BS factor,} \\ 3 & \text{otherwise.} \end{cases}$$

Let us draw in Figure 1 reduced Rauzy graphs containing short BS factors. In order to
describe factor frequencies, it suffices to consider reduced Rauzy graphs containing short BS
factors together with the following observations concerning reduced Rauzy graphs of $\mathbf{u}$.

Observation 5.3. (1) Any reduced Rauzy graph has either four vertices (two LS factors
and two RS factors) or two vertices (BS factors).

(2) Reduced Rauzy graphs of order larger than 5 whose vertices are BS factors are obtained
from the graphs in Figure 1 by a repeated application of the mapping $w \to \varphi(w)p_{w_n}$
simultaneously to all vertices and edges.

(3) By Corollary 2.7, it is not difficult to see that if, for a reduced Rauzy graph $\tilde{\Gamma}_n$
whose vertices are not BS factors, we find the reduced Rauzy graph of minimal larger order, say
$\tilde{\Gamma}_m$, whose vertices are BS factors, then

$$\{\rho(e) \mid e \text{ edge in } \tilde{\Gamma}_n\} = \{\rho(e) \mid e \text{ edge in } \tilde{\Gamma}_m\} \cup \{\rho(v) \mid v \text{ vertex in } \tilde{\Gamma}_m\}.$$

The last step in the derivation of frequencies of factors of u is to determine the frequencies 
of edges and vertices in the reduced Rauzy graphs depicted in Figure 1. In the sequel, we make 
use of Kirchhoff's law for frequencies (Observation 2.5), of the fact that symmetries preserve 
factor frequencies, and of the formula from Proposition 2.4. 
Figure 1. Reduced Rauzy graphs of $\mathbf{u}$ of order $n \in \{1, 2, 5\}$.

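(1) $\tilde{\Gamma}_1$:

$$\rho(0) = \rho(1) = \frac{\sqrt{3} - 1}{2}$$

$$\rho(130) = \rho(021) = \rho(2) = \frac{2 - \sqrt{3}}{2} = \frac{1}{2\lambda}$$

$$\rho(01) = \rho(10) = \rho(0) - \rho(130) = \frac{\sqrt{3}}{2\lambda}$$

In the second row, the first equality follows from the fact that symmetries preserve
frequencies and $130 = \theta_2(021)$ and the second equality by Corollary 2.6 from the fact
that 2 is neither LS, nor RS. In the third row, the first equality is again due to symmetries
and the second uses Kirchhoff's law for frequencies from Observation 2.5.

(2) $\tilde{\Gamma}_2$:

$$\rho(01) = \rho(10) = \frac{\sqrt{3}}{2\lambda}$$

$$\rho(01301) = \rho(10210) = \rho(130) = \frac{1}{2\lambda}$$

$$\rho(010) = \rho(101) = \rho(01) - \rho(01301) = \frac{\sqrt{3} - 1}{2\lambda}$$

(3) $\tilde{\Gamma}_5$:

$$\rho(01301) = \rho(10210) = \frac{1}{2\lambda}$$

$$\rho(\varphi(0)10210) = \rho(\varphi(1)01301) = \frac{\rho(0)}{\lambda} = \frac{\sqrt{3} - 1}{2\lambda}$$

$$\rho(01301301) = \rho(10210210) = \rho(01301) - \rho(\varphi(0)10210) = \frac{2 - \sqrt{3}}{2\lambda} = \frac{1}{2\lambda^2}$$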

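Putting together Proposition 5.2, properties of reduced Rauzy graphs summarized in Observation 5.3,
and the knowledge of frequencies of vertices and edges in $\tilde{\Gamma}_1$, $\tilde{\Gamma}_2$, and $\tilde{\Gamma}_5$, we obtain the
following corollary.

Corollary 5.4. Let $n \in \mathbb{N}$, $n \geq 1$, such that

(1) $\mathcal{L}_n(\mathbf{u})$ contains a BS factor: then there exists $k \in \mathbb{N}$ such that the set $\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\}$
is of one of the following forms:

(a) $\left\{ \frac{\sqrt{3}}{2\lambda^{k+1}}, \frac{1}{2\lambda^{k+1}} \right\}$;

(b) $\left\{ \frac{1}{2\lambda^{k+1}}, \frac{\sqrt{3} - 1}{2\lambda^{k+1}} \right\}$;

(c) $\left\{ \frac{\sqrt{3} - 1}{2\lambda^{k+1}}, \frac{1}{2\lambda^{k+2}} \right\}$.

(2) $\mathcal{L}_n(\mathbf{u})$ does not contain a BS factor: then there exists $k \in \mathbb{N}$ such that the set $\{\rho(e) \mid e \in
\mathcal{L}_{n+1}(\mathbf{u})\}$ is of one of the following forms:

(a) $\left\{ \frac{\sqrt{3} - 1}{2\lambda^{k}}, \frac{\sqrt{3}}{2\lambda^{k+1}}, \frac{1}{2\lambda^{k+1}} \right\}$;

(b) $\left\{ \frac{\sqrt{3}}{2\lambda^{k+1}}, \frac{1}{2\lambda^{k+1}}, \frac{\sqrt{3} - 1}{2\lambda^{k+1}} \right\}$;

(c) $\left\{ \frac{1}{2\lambda^{k+1}}, \frac{\sqrt{3} - 1}{2\lambda^{k+1}}, \frac{1}{2\lambda^{k+2}} \right\}$.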

A direct consequence of the previous corollary is the optimality of the upper bound from
Theorem 4.1.

Proposition 5.5. Let $\mathbf{u}$ be the fixed point of $\varphi$ defined in (13). Then for every $n \in \mathbb{N}$, $n \geq 1$, it
holds

$$\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} = \frac{1}{4}\bigl(4\Delta\mathcal{C}(n) + \#G - X - Y\bigr),$$

where $X$ is the number of BS factors of length $n$ and $Y$ is the number of BS factors of length $n$
that are $\theta_1$- or $\theta_2$-palindromes.

Proof. Let us first consider $n$ such that $\mathcal{L}_n(\mathbf{u})$ does not contain a BS factor. Then, on one
hand, Corollary 5.4 states that $\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} = 3$. On the other hand,
$\frac{1}{4}\bigl(4\Delta\mathcal{C}(n) + \#G - X - Y\bigr) = \frac{4 \cdot 2 + 4}{4} = 3$. Second, let $\mathcal{L}_n(\mathbf{u})$ contain a BS factor. Then, on one hand, we
have by Corollary 5.4 $\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u})\} = 2$. On the other hand, by (1) of Observation 5.3,
$\mathcal{L}_n(\mathbf{u})$ contains 2 BS factors, and by Corollary 5.1, each of them is a $\theta_1$- or a $\theta_2$-palindrome,
thus $\frac{1}{4}\bigl(4\Delta\mathcal{C}(n) + \#G - X - Y\bigr) = \frac{4 \cdot 2 + 4 - 2 - 2}{4} = 2$. □

Remark 5.6. There are also infinite words whose language is invariant under elements of
a finite group of symmetries, however, the upper bound from Theorem 4.1 is not reached for
any $n \in \mathbb{N}$. Such an example is the famous Thue-Morse word. Its group of symmetries is $G =
\{\mathrm{Id}, R, \Psi, \Psi \circ R\}$, where $\Psi$ is a morphism acting on $\{0, 1\}$ as follows:

$\Psi : 0 \to 1,\ 1 \to 0.$

As shown by Dekking [8], the Thue-Morse word $\mathbf{u}_{TM}$ satisfies for $n \in \mathbb{N}$, $n \geq 1$,

$$\#\{\rho(e) \mid e \in \mathcal{L}_{n+1}(\mathbf{u}_{TM})\} = \begin{cases} 1 & \text{if } \mathbf{u}_{TM} \text{ contains a BS factor of length } n, \\ 2 & \text{otherwise.} \end{cases}$$

But the upper bound from Theorem 4.1 is of the following form for $n \in \mathbb{N}$, $n \geq 1$:

2 or 4 if $\mathbf{u}_{TM}$ contains a BS factor of length $n$,

3 or 5 otherwise.